Marcin Stepniak
Version Control Systems, Git and GitHub
Application of git in RStudio IDE;
Application of git in GitHub desktop.
Git is a distributed version control system
each client (user) has its own local copy
=> local repository
which can be synchronized with a copy stored on external server
=> remote repository
GitHub is a commercial Git repository hosting service
with some extra functionalities:
cloud-based service
web-based graphical interface
issue tracker
basic project management functionalities, access granting etc.
Extended free plan:
track changes: always have the most recent version of a file, while keeping the entire history of changes
experiment: you can experiment with your code
and revert changes if needed
multiple instances: you can have several versions of your code
and easily switch between them
in case of using a remote repository (e.g. GitHub):
README.md) file
remotes::install_github("r-lib/remotes") )
Repository is like a main folder of your project.
It may contain any type of files, (sub)folders etc.
Golden rule: one project, one repository.
README file: not required but highly recommended.
Describes the repo, may contain documentation, use-cases or
any other information you think may be useful for a potential user (including yourself!)
Go to your GitHub account and select New repository
The repository name should be a character string without spaces.
No official naming convention, but hyphens are mostly used to separate words
Great repository names are short and memorable.
Besides the repository name you can:
add a brief description (read: you should)
create README.md file (yes, you want to do that)
A private repository is visible only to you and to anyone you grant access to. It is a good solution for:
.gitignore contains a list of all the files that should not be included in the repo.
It’s highly recommended to include .gitignore in your repository.
GitHub offers a list of .gitignore templates.
Select the one which fits you best.
README and .gitignore (R)
Clone creates a local copy of a remote repository.
It contains all the project’s files and the project’s full history.
Done!
Use .gitignore for all those files that should not be taken into account by Git so they are:
excluded from the version control and not tracked by Git
not included in any commit
not pushed to the remote repository
not shared with collaborators / published
They only reside on your local machine.
Anything we don’t want or don’t need to keep track of
user-specific files / settings (e.g. MyProjectName.Rproj)
notes & drafts
data, in particular large files
outputs of our code: tables, figures, html rendered from Rmarkdown files, etc.
Any personal / confidential data:
Anything after # is ignored
*.Rmd ignores all .Rmd files (* replaces any character(s))
!main.Rmd states that main.Rmd should be tracked (not ignored)
test.R ignores every file named test.R, in any folder
/test.R ignores test.R only in the folder where .gitignore is located
test/test.R ignores the test.R file in the test folder
test/ ignores all files in the test folder
Tip: use templates, add manually what you need.
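The rules above can be tried out in a throwaway repository. This is only a minimal sketch, assuming git is installed. The directory and the file names (notes.Rmd, analysis.R and so on) are made up for illustration. It stages everything and then lists what Git actually tracks.

```shell
#!/bin/sh
set -eu

# Throwaway repository in a temporary directory (hypothetical file names).
demo=$(mktemp -d)
cd "$demo"
git init -q

mkdir test
touch main.Rmd notes.Rmd analysis.R test/test.R

# Rules taken from the list above.
cat > .gitignore <<'EOF'
# my files (anything after # is ignored)
*.Rmd
!main.Rmd
test/
EOF

# Stage everything that is not ignored, then list the tracked files.
git add -A
tracked=$(git ls-files | tr '\n' ' ')
echo "Tracked files: $tracked"
```

In this sketch notes.Rmd and everything under test/ stay out of the index, while main.Rmd is kept because of the ! rule.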
.gitignore file
add a comment my files and include:
.Rproj type
temporary folder
save .gitignore
revise list of files in your git pane
copy lines added to .gitignore and send me through chat (directly to me!)
Commit is a snapshot of the repository.
Commits are cheap.
Commit often.
Each commit has its own identifier.
You can:
+ have an access to the repository at a given stage
+ revise changes made by a particular commit
You can even compare binary files!
Use Display the rich diff
Each commit must contain a message.
Add useful, self-explanatory messages.
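On the command line, the commit identifier is what lets you go back to a given stage or inspect a particular change. A minimal sketch, assuming git is installed. The file name report.txt, the commit messages and the identity values are placeholders, not part of the workshop material.

```shell
#!/bin/sh
set -eu

# Placeholder identity so the commits work without any git configuration.
export GIT_AUTHOR_NAME="Workshop User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Workshop User" GIT_COMMITTER_EMAIL="user@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q

# Two commits, each with a self-explanatory message.
echo "version 1" > report.txt
git add report.txt
git commit -q -m "Add first version of the report"
first=$(git rev-parse HEAD)      # identifier of the first commit

echo "version 2" > report.txt
git commit -q -am "Update the report to version 2"
second=$(git rev-parse HEAD)     # identifier of the second commit

# History: one line per commit (short identifier and message).
git log --oneline

# The file as it was at the first commit.
old_content=$(git show "$first:report.txt")

# Which files changed between the two commits.
changed=$(git diff --name-only "$first" "$second")
echo "At $first report.txt contained: $old_content"
echo "Changed between the two commits: $changed"
```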
It’s all about syncing your local and remote repositories.
Push changes: sends recent commit(s) from your local repository to the remote one.
Pull repository: downloads recent commits from the remote repository and updates your local repository.
Tip 1: Pull before push (sooner or later, this habit helps to keep your mental health).
Tip 2: Push often. Commit even more often.
Every time you push, you are making a cloud backup of your project.
Extra point: you get more pit stops you can refer to.
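The push / pull cycle can be rehearsed without a GitHub account. In this minimal sketch a local bare repository stands in for the remote, and two folders (named local and other here, both hypothetical) stand in for two machines. It assumes git is installed. The identity values are placeholders.

```shell
#!/bin/sh
set -eu

# Placeholder identity so the commits work without any git configuration.
export GIT_AUTHOR_NAME="Workshop User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Workshop User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"

# A bare repository plays the role of the remote (GitHub in the workshop).
git init -q --bare remote.git
git -C remote.git symbolic-ref HEAD refs/heads/master

# First local repository: commit, then push.
git init -q local
cd local
git symbolic-ref HEAD refs/heads/master
echo "My project" > README.md
git add README.md
git commit -q -m "Add README"
git remote add origin "$work/remote.git"
git push -q -u origin master
cd ..

# Second copy of the same remote: clone, commit, push.
git clone -q remote.git other
cd other
echo "A second line" >> README.md
git commit -q -am "Extend README"
git push -q origin master
cd ../local

# Back in the first copy: pull to receive the new commit.
git pull -q origin master
lines=$(wc -l < README.md | tr -d ' ')
local_head=$(git rev-parse HEAD)
other_head=$(git -C ../other rev-parse HEAD)
echo "README.md now has $lines lines"
```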
commit
.gitignore
Committed: form part of the repo in the current form.
Git keeps their most recent state and full history of changes.
Untracked: git sees them but has no clue what’s your point, yet.
Staged: their current form is frozen and can be added to commit.
In case of further changes, you need to unstage the file and stage it again.
Ignored: git neither sees them nor cares whether they have changed.
You need to explicitly inform git to ignore a file.
Added: new file added to repository.
Untracked file.
Deleted file: it is not available in repository anymore
important: the file can be found when browsing commits / history of repository.
Modified: file has been changed since the last commit
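The same states can be seen on the command line with git status. This is a sketch only, assuming git is installed. All file names and the identity values are invented for the example. In the short format, A marks an added file, M a modified one, D a deleted one, ?? an untracked one and !! an ignored one.

```shell
#!/bin/sh
set -eu

# Placeholder identity so the commit works without any git configuration.
export GIT_AUTHOR_NAME="Workshop User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Workshop User" GIT_COMMITTER_EMAIL="user@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q

# Starting point: two tracked files and an ignore rule, all committed.
echo "ignored.txt" > .gitignore
echo "a" > modified.txt
echo "b" > deleted.txt
git add .
git commit -q -m "Initial commit"

echo "change" >> modified.txt    # modified since the last commit
rm deleted.txt                   # deleted from the working folder
echo "new" > added.txt
git add added.txt                # new file, staged (added)
echo "draft" > untracked.txt     # new file, not staged (untracked)
echo "secret" > ignored.txt      # matched by .gitignore (ignored)

# Short status, including ignored files.
status=$(git status --short --ignored)
echo "$status"
```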
Select which changes (modify / add / delete) should be added to the commit (i.e. which changes will be saved together with the commit)
Unselected changes will be available only in your local environment and cannot be pushed to the remote repository.
The file(s) or their recent changes are not tracked.
Once you want to commit changes, you are taken to the new window.
You can revise changes file by file and decide which of them you want to add to the commit.
Finally, you have to add a commit message. Otherwise you get the following error:
Aborting commit due to empty commit message.
Once you finalize your commit, you can push your commit(s) to the remote repository.
After a while, the commit appears in your GitHub repository
(you need to refresh your browser).
If you stage a file and then you make some extra changes, they will not be added to the commit.
In order to include them in the current commit, you need to
unstage the file and stage it again.
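The effect of changing a file after staging it can be checked in a scratch repository. A minimal sketch, assuming git is installed. The file notes.txt and the identity values are placeholders. On the command line, running git add on the file again is enough to pick up the later change.

```shell
#!/bin/sh
set -eu

# Placeholder identity so the commits work without any git configuration.
export GIT_AUTHOR_NAME="Workshop User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Workshop User" GIT_COMMITTER_EMAIL="user@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q

echo "first line" > notes.txt
git add notes.txt                  # stage: the current content is frozen
echo "second line" >> notes.txt    # extra change made after staging
git commit -q -m "Add notes"

# The commit holds only what was staged.
committed=$(git show HEAD:notes.txt)
# The later change is still pending in the working folder.
pending=$(git status --porcelain)
echo "Committed content: $committed"
echo "Pending: $pending"

# Stage the file again to include the extra change in the next commit.
git add notes.txt
git commit -q -m "Add second line to notes"
final_lines=$(git show HEAD:notes.txt | wc -l | tr -d ' ')
```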
example_code.R file to your project folder.
temporary folder
exercice_description.R to the temporary folder
example_code.R file
example_code.R file in RStudio
exercice_description.R file in RStudio
exercice_description.R.
example_code.R by copied lines
my_first_plot <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))
Conflict appears when different changes have been made to the same file since the last commit.
Conflicts have to be resolved.
Open README file directly from your github remote repository:
Edit the file, and commit changes:
WITHOUT pulling changes, open README.md file in RStudio
and modify the file (as you wish, but differently than you have done via web).
Save the file, stage it and add to the commit.
Commit changes and Push them to the remote repository
Congratulations! You have provoked your first git conflict.
Bad news: you need to somehow resolve it.
<<<<<<< and >>>>>>> followed by the commit id)
<<<<<<<, =======, >>>>>>>)
Resolved!
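The conflict exercise can also be reproduced entirely on one machine. In this sketch a local bare repository stands in for GitHub, and a second clone (called web here) plays the role of the edit made in the browser. It assumes git is installed. The folder names, file contents and identity values are made up.

```shell
#!/bin/sh
set -eu

# Placeholder identity so the commits work without any git configuration.
export GIT_AUTHOR_NAME="Workshop User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Workshop User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git
git -C remote.git symbolic-ref HEAD refs/heads/master

# Local repository with a README, pushed to the remote.
git init -q local
cd local
git symbolic-ref HEAD refs/heads/master
echo "My project" > README.md
git add README.md
git commit -q -m "Add README"
git remote add origin "$work/remote.git"
git push -q -u origin master
cd ..

# The "web" clone edits the README and pushes first.
git clone -q remote.git web
cd web
echo "My project (edited on the web)" > README.md
git commit -q -am "Edit README on the web"
git push -q origin master
cd ../local

# WITHOUT pulling, edit the same line locally and commit.
echo "My project (edited locally)" > README.md
git commit -q -am "Edit README locally"

# The push is rejected: the remote has a commit we do not have yet.
if git push -q origin master 2>/dev/null; then pushed=yes; else pushed=no; fi

# Pulling tries to merge both edits and stops with a conflict.
if git pull -q --no-rebase origin master >/dev/null 2>&1; then merged=yes; else merged=no; fi

# Git wrote the conflict markers into the file.
markers=$(grep -c -E '^(<<<<<<<|=======|>>>>>>>)' README.md)
cat README.md

# Resolve: keep the text you want, remove the markers, stage and commit.
echo "My project (edited locally and on the web)" > README.md
git add README.md
git commit -q -m "Resolve conflict in README"
git push -q origin master
```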
Branch is a parallel version of a repository.
The main branch of the repository is a master branch.
By creating a new branch you duplicate a current state of the repository (all files, commits etc.).
Source: https://docs.github.com/assets/images/help/branches/pr-retargeting-diagram1.png
You can create a new branch via GitHub.com web:
Name convention for branches is similar to the repo one (e.g. this-is-my-new-branch)
Once you’ve created a new branch, it appears in your local repo,
just pull the remote one.
Now you can check out this new branch.
Any further commit will be added to this new branch, while master will be untouched.
You can create a new branch directly from RStudio:
Select a name for the new branch.
Tip: Add remote, preferably using the same name as in your local repository.
The new branch appears also in your remote repository:
example_code.R with lines 31-33 from exercice_description.R
Once you are ready to incorporate the changes, you can merge them into your master branch.
It means that all commits from a given branch will be included in your master branch.
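The workshop merges through GitHub. The same idea can be sketched locally with plain git commands. This is a minimal illustration, assuming git is installed. The branch name my-new-branch, the file contents and the identity values are placeholders.

```shell
#!/bin/sh
set -eu

# Placeholder identity so the commits work without any git configuration.
export GIT_AUTHOR_NAME="Workshop User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Workshop User" GIT_COMMITTER_EMAIL="user@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

echo "x <- 1" > example_code.R
git add example_code.R
git commit -q -m "Add example code"

# Create a new branch and check it out. Further commits go to this branch.
git checkout -q -b my-new-branch
echo "y <- 2" >> example_code.R
git commit -q -am "Add second variable"

# master is untouched so far.
master_lines=$(git show master:example_code.R | wc -l | tr -d ' ')

# Merge: all commits from the branch are included in master.
git checkout -q master
git merge -q my-new-branch
merged_lines=$(wc -l < example_code.R | tr -d ' ')

# The branch is no longer needed.
git branch -q -d my-new-branch
echo "master had $master_lines line before the merge and $merged_lines after"
```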
On GitHub.com you can compare two branches.
To do so you can either:
add /compare to your repository path (e.g. https://github.com/stmarcin/github-workshop/compare) or
click Compare & pull request and scroll down.
Pull request opens a discussion on proposed changes.
By filing a pull request you are asking a repo owner to pull your changes to the repository.
Note: you can ask yourself (as repo owner) as well!
Submitted changes are to be reviewed and either accepted or rejected by a repository’s owner.
Once you confirm merge, you can delete an old branch.
Go back to RStudio. Check out master and click Pull.
>>> C:/Program Files/Git/bin/git.exe pull
From https://github.com/stmarcin/test_workshop
2387ab3..4e6e32c master -> origin/master
Updating 2387ab3..4e6e32c
Fast-forward
README.md | 1 +
example_code.R | 5 +++--
2 files changed, 4 insertions(+), 2 deletions(-)
The branch is deleted from the remote repository, but it is still visible in RStudio pop-up menu:
In order to remove it you need to:
git fetch -p
git branch -d name-of-your-branch
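The two commands above can be tried against a local stand-in for the remote. In this sketch a bare repository replaces GitHub, and deleting the branch inside it imitates deleting the branch on GitHub after the merge. It assumes git is installed. The folder names and identity values are placeholders.

```shell
#!/bin/sh
set -eu

# Placeholder identity so the commits work without any git configuration.
export GIT_AUTHOR_NAME="Workshop User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Workshop User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git
git -C remote.git symbolic-ref HEAD refs/heads/master

git init -q local
cd local
git symbolic-ref HEAD refs/heads/master
echo "My project" > README.md
git add README.md
git commit -q -m "Add README"
git remote add origin "$work/remote.git"
git push -q -u origin master

# Work on a branch and push it.
git checkout -q -b name-of-your-branch
echo "More text" >> README.md
git commit -q -am "Extend README"
git push -q -u origin name-of-your-branch

# Merge into master, then delete the branch on the remote side
# (stands in for merging the pull request and deleting the branch on GitHub).
git checkout -q master
git merge -q name-of-your-branch
git push -q origin master
git -C "$work/remote.git" branch -q -D name-of-your-branch

# The local repository still remembers the deleted remote branch.
before=$(git branch -r | grep -c 'origin/name-of-your-branch' || true)

# Forget remote branches that no longer exist.
git fetch -q -p origin
after=$(git branch -r | grep -c 'origin/name-of-your-branch' || true)

# Delete the local branch as well.
git branch -q -d name-of-your-branch
echo "remote-tracking entries before: $before, after: $after"
```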
master
Forking a repository makes a personal copy of another user’s repository.
All commits you make are pushed to your copy of the original repository without affecting the original one.
You can establish a link to an original repo, keep your fork synchronized and make a pull request to an upstream repo.
Once you fork a repository you need to clone it to the local repo.
Once it is cloned you have to set a link to the original one.
Open Shell and write the following:
git remote add upstream https://github.com/stmarcin/repo-for-workshop-tgis
Create a new branch and check-out to it.
Add/edit files, stage them and commit changes.
Push changes to your remote repository.
Create a Pull request
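The fork workflow can be rehearsed locally before trying it on GitHub. In this sketch two bare repositories stand in for the original repository (upstream.git) and for your fork (fork.git). All folder names, file names and identity values are hypothetical. It assumes git is installed. The pull request itself is opened on GitHub and is not part of the sketch.

```shell
#!/bin/sh
set -eu

# Placeholder identity so the commits work without any git configuration.
export GIT_AUTHOR_NAME="Workshop User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Workshop User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"

# The owner's working copy of the original project.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
echo "Original project" > README.md
git add README.md
git commit -q -m "Add README"
cd ..

git clone -q --bare seed upstream.git       # the original repository
git clone -q --bare upstream.git fork.git   # your fork of it
git clone -q fork.git local                 # your local clone of the fork

# Link the local clone to the original repository.
cd local
git remote add upstream "$work/upstream.git"
remotes=$(git remote | sort | tr '\n' ' ')

# Meanwhile the owner updates the original repository.
cd ../seed
echo "News from the owner" >> README.md
git commit -q -am "Update README"
git push -q "$work/upstream.git" master
cd ../local

# Keep the fork synchronized: fetch from upstream, merge, push to the fork.
git fetch -q upstream
git merge -q upstream/master
git push -q origin master

# Create a new branch, commit and push it to the fork.
git checkout -q -b new-branch
echo "My contribution" > contribution.txt
git add contribution.txt
git commit -q -m "Add my contribution"
git push -q -u origin new-branch
echo "Remotes: $remotes"
```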
GitHub issues serve to communicate with users (including yourself).
Each issue has its own id, which can then be referred to, and contains its own discussion thread.
#) in your commit messages, e.g. Solve #3
@mention another user so they are informed about the issue.
https://github.com/stmarcin/repo-for-workshop-tgis and clone it to your local settings.
@mention me!
new-branch-NumberOfYourFile)
Go to File -> New repository (or Ctrl + N)
For the sake of your mental health:
use the same name for both, local and remote repository!
You can just simply copy a file to your repo’s folder:
GitHub desktop offers default commit messages
(as sophisticated as: Create file.txt or Update file.txt)
Once you commit, you’ll see that your changes are not pushed to your remote repo (1).
You need to push commits manually (2)
In order to create a new branch go to:
Branch -> New Branch (or Ctrl + Shift + N):
Once you have created a new branch, confirm which branch you are currently on:
I have accidentally committed to the master branch instead of my-first-new-branch.
GitHub Desktop allows you to Update your new branch from master.
Go to Repository -> Create issue on GitHub (or Ctrl + i)
You can directly refer to the issue writing a commit message:
Select a file in Changes pane so you can:
You need to check out the branch you want to merge into (e.g. master)
When it’s done, you can delete a branch:
issue
my-new-branch and check out to it
my-new-branch to master and delete it.
Do you already have an R project and need to create a GitHub repo for it?
No problem!
The easiest way to do it (no configuration needed):
GitHub (remember to include .gitignore);
.gitignore)
Done!
{usethis} package
install.packages("usethis")
usethis::use_git()
Commit. Add unnecessary files to .gitignore.
Create a new repo on GitHub as a remote to your local one
usethis::use_github()
Note 1: see the use_github() help to check its parameters ( ?usethis::use_github() ).
Note 2: in case of problems, check the relevant chapter of Happy Git and GitHub for the useR.
repository .gitignore
local .gitignore
Git pane
You can select which files you want to add to .gitignore
In the next step you can select where .gitignore will be located
and add whichever files you want, even those which do not yet exist
{usethis} package
use_git_ignore() tells git to ignore particular file(s)
usethis::use_git_ignore(ignores, directory = ".")
edit_git_ignore() opens the .gitignore file so you can edit it manually.
usethis::edit_git_ignore()